## # A tibble: 6 × 53
##   STATE ST_CASE VE_TOTAL VE_FORMS PVH_INVL  PEDS PERNOTMVIT PERMVIT
##   <int>   <int>    <int>    <int>    <int> <int>      <int>   <int>
## 1     1   10001        1        1        0     0          0       1
## 2     1   10002        1        1        0     0          0       1
## 3     1   10003        1        1        0     0          0       2
## 4     1   10004        1        1        0     0          0       1
## 5     1   10005        2        2        0     0          0       2
## 6     1   10006        1        1        0     0          0       2
## # ... with 45 more variables: PERSONS <int>, COUNTY <int>, CITY <int>,
## #   DAY <int>, MONTH <int>, YEAR <int>, DAY_WEEK <int>, HOUR <int>,
## #   MINUTE <int>, NHS <int>, RUR_URB <int>, FUNC_SYS <int>,
## #   RD_OWNER <int>, ROUTE <int>, TWAY_ID <chr>, TWAY_ID2 <chr>,
## #   MILEPT <int>, LATITUDE <dbl>, LONGITUD <dbl>, SP_JUR <int>,
## #   HARM_EV <int>, MAN_COLL <int>, RELJCT1 <int>, RELJCT2 <int>,
## #   TYP_INT <int>, WRK_ZONE <int>, REL_ROAD <int>, LGT_COND <int>,
## #   WEATHER1 <int>, WEATHER2 <int>, WEATHER <int>, SCH_BUS <int>,
## #   RAIL <chr>, NOT_HOUR <int>, NOT_MIN <int>, ARR_HOUR <int>,
## #   ARR_MIN <int>, HOSP_HR <int>, HOSP_MN <int>, CF1 <int>, CF2 <int>,
## #   CF3 <int>, FATALS <int>, DRUNK_DR <int>, DRUNK <lgl>
## # A tibble: 6 × 102
##   STATE ST_CASE VEH_NO VE_FORMS NUMOCCS   DAY MONTH  HOUR MINUTE HARM_EV
##   <int>   <int>  <int>    <int>   <int> <int> <int> <int>  <int>   <int>
## 1     1   10001      1        1       1     1     1     2     40      35
## 2     1   10002      1        1       1     1     1    22     13      34
## 3     1   10003      1        1       2     1     1     1     25      42
## 4     1   10004      1        1       1     4     1     0     57      53
## 5     1   10005      1        2       1     7     1     7      9      12
## 6     1   10005      2        2       1     7     1     7      9      12
## # ... with 92 more variables: MAN_COLL <int>, UNITTYPE <int>,
## #   HIT_RUN <int>, REG_STAT <int>, OWNER <int>, MAKE <int>, MODEL <int>,
## #   MAK_MOD <int>, BODY_TYP <int>, MOD_YEAR <int>, VIN <chr>, VIN_1 <chr>,
## #   VIN_2 <chr>, VIN_3 <chr>, VIN_4 <chr>, VIN_5 <chr>, VIN_6 <chr>,
## #   VIN_7 <chr>, VIN_8 <chr>, VIN_9 <chr>, VIN_10 <chr>, VIN_11 <chr>,
## #   VIN_12 <chr>, TOW_VEH <int>, J_KNIFE <int>, MCARR_I1 <int>,
## #   MCARR_I2 <chr>, MCARR_ID <chr>, GVWR <int>, V_CONFIG <int>,
## #   CARGO_BT <int>, HAZ_INV <int>, HAZ_PLAC <int>, HAZ_ID <int>,
## #   HAZ_CNO <int>, HAZ_REL <int>, BUS_USE <int>, SPEC_USE <int>,
## #   EMER_USE <int>, TRAV_SP <int>, UNDERIDE <int>, ROLLOVER <int>,
## #   ROLINLOC <int>, IMPACT1 <int>, DEFORMED <int>, TOWED <int>,
## #   M_HARM <int>, VEH_SC1 <int>, VEH_SC2 <int>, FIRE_EXP <int>,
## #   DR_PRES <int>, L_STATE <int>, DR_ZIP <int>, L_STATUS <int>,
## #   L_TYPE <int>, CDL_STAT <int>, L_ENDORS <int>, L_COMPL <int>,
## #   L_RESTRI <int>, DR_HGT <int>, DR_WGT <int>, PREV_ACC <int>,
## #   PREV_SUS <int>, PREV_DWI <int>, PREV_SPD <int>, PREV_OTH <int>,
## #   FIRST_MO <int>, FIRST_YR <int>, LAST_MO <int>, LAST_YR <int>,
## #   SPEEDREL <int>, DR_SF1 <int>, DR_SF2 <int>, DR_SF3 <int>,
## #   DR_SF4 <int>, VTRAFWAY <int>, VNUM_LAN <int>, VSPD_LIM <int>,
## #   VALIGN <int>, VPROFILE <int>, VPAVETYP <int>, VSURCOND <int>,
## #   VTRAFCON <int>, VTCONT_F <int>, P_CRASH1 <int>, P_CRASH2 <int>,
## #   P_CRASH3 <int>, PCRASH4 <int>, PCRASH5 <int>, ACC_TYPE <int>,
## #   DEATHS <int>, DR_DRINK <int>

How many fatalities occur per accident?

How many vehicles are involved in accidents with fatalities?

Crashes involving 1-3 vehicles are so dominant that we can't even see the rare large pileups, which range all the way up to 58 vehicles!

## 
##     1     2     3     4     5     6     7     8     9    10    11    12 
## 18070 11616  1815   418   119    61    30    14    10     1     2     2 
##    14    16    19    22    29    31    58 
##     1     1     2     1     1     1     1
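The counts above can be used to put a number on how dominant the small crashes are. This is a minimal sketch in base R that re-enters the printed table by hand; the names `vehicles`, `crashes`, `share_small`, and `log_counts` are illustrative and not from the original analysis, which presumably tabulated the `VE_TOTAL` column directly.

```r
# Counts copied from the table above: fatal crashes by number of vehicles involved.
vehicles <- c(1:12, 14, 16, 19, 22, 29, 31, 58)
crashes  <- c(18070, 11616, 1815, 418, 119, 61, 30, 14, 10, 1, 2, 2,
              1, 1, 2, 1, 1, 1, 1)
names(crashes) <- vehicles

# Share of fatal crashes that involve three or fewer vehicles.
share_small <- sum(crashes[vehicles <= 3]) / sum(crashes)

# On a linear axis the bars for larger crashes shrink to nothing;
# log10 counts keep the rare pileups visible.
log_counts <- log10(crashes)

# Draw the log-scale bar chart only when running interactively.
if (interactive()) {
  barplot(crashes, log = "y",
          xlab = "Vehicles involved", ylab = "Fatal crashes (log scale)")
}
```

A log-scaled count axis is one option among several; truncating the x-axis or binning everything above some threshold would also make the tail readable.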

What time of day do accidents occur?

Does it change from month to month?

What about by week?

When do drunk-driver incidents occur?
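One way to approach the timing questions is to cross-tabulate hour of day against a drunk-driver flag. The sketch below uses a small invented data frame rather than the real accident table. It assumes that `HOUR` uses 99 for an unknown time and that the `DRUNK` column is simply `DRUNK_DR > 0`; neither assumption is confirmed by the output shown here, so check the data guide before relying on them.

```r
# Toy stand-in for the accident table; all values are invented for illustration.
accident <- data.frame(
  HOUR     = c(2L, 22L, 1L, 0L, 7L, 7L, 99L, 14L),
  DRUNK_DR = c(1L, 0L, 1L, 2L, 0L, 0L, 0L, 0L)
)

# Assumption: 99 marks an unknown hour, so keep only hours 0-23.
known <- accident[accident$HOUR %in% 0:23, ]

# Assumption: a crash counts as DRUNK when at least one drinking driver is recorded.
known$DRUNK <- known$DRUNK_DR > 0

# Cross-tabulate hour of day against the drunk-driver flag.
# Using a factor with all 24 levels keeps hours with zero crashes in the table.
by_hour <- table(
  HOUR  = factor(known$HOUR, levels = 0:23),
  DRUNK = known$DRUNK
)
```

The same pattern should carry over to `MONTH` or `DAY_WEEK` by swapping the column and the factor levels.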

Where are crashes occurring?

Alright, the plotted data looks like a smashed map, and if we check the data guide we see:

LONGITUD Meaning
DDD.DDDD Actual Degrees
777.7777 Not Reported
888.8888 Not Available (If State Exempt)
999.9999 Unknown

All three missing-value codes are positive, while valid US locations should have negative longitudes, so we can drop any longitude greater than 0. Then we are ready to plot.
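The filter described here can be sketched in base R as follows. The data frame is a toy with invented coordinates, and the names `accident` and `mappable` are assumptions made for illustration; only the `LONGITUD` codes come from the data guide excerpt above.

```r
# Toy stand-in for the accident table: two plausible points plus the three
# missing-value codes from the data guide. Latitudes are invented.
accident <- data.frame(
  ST_CASE  = c(10001, 10002, 10003, 10004, 10005),
  LATITUDE = c(33.5, 34.7, 32.4, 30.7, 31.2),
  LONGITUD = c(-86.8, -86.6, 777.7777, 888.8888, 999.9999)
)

# Drop any longitude greater than 0 (and any NA), keeping the mappable rows.
keep     <- !is.na(accident$LONGITUD) & accident$LONGITUD <= 0
mappable <- accident[keep, ]
```

A stricter alternative would be to exclude the three codes explicitly, which avoids discarding any legitimate positive longitudes if the data ever includes them.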

As heatmaps usually seem to turn out, this is just a population map. This kind of validates that traffic fatalities happen where people live… which isn’t exactly shocking.

Person data

##  [1] "STATE"      "ST_CASE"    "VE_FORMS"   "VEH_NO"     "PER_NO"    
##  [6] "STR_VEH"    "COUNTY"     "DAY"        "MONTH"      "HOUR"      
## [11] "MINUTE"     "RUR_URB"    "FUNC_SYS"   "HARM_EV"    "MAN_COLL"  
## [16] "SCH_BUS"    "MAKE"       "MAK_MOD"    "BODY_TYP"   "MOD_YEAR"  
## [21] "TOW_VEH"    "SPEC_USE"   "EMER_USE"   "ROLLOVER"   "IMPACT1"   
## [26] "FIRE_EXP"   "AGE"        "SEX"        "PER_TYP"    "INJ_SEV"   
## [31] "SEAT_POS"   "REST_USE"   "REST_MIS"   "AIR_BAG"    "EJECTION"  
## [36] "EJ_PATH"    "EXTRICAT"   "DRINKING"   "ALC_DET"    "ALC_STATUS"
## [41] "ATST_TYP"   "ALC_RES"    "DRUGS"      "DRUG_DET"   "DSTATUS"   
## [46] "DRUGTST1"   "DRUGTST2"   "DRUGTST3"   "DRUGRES1"   "DRUGRES2"  
## [51] "DRUGRES3"   "HOSPITAL"   "DOA"        "DEATH_DA"   "DEATH_MO"  
## [56] "DEATH_YR"   "DEATH_HR"   "DEATH_MN"   "DEATH_TM"   "LAG_HRS"   
## [61] "LAG_MINS"   "P_SF1"      "P_SF2"      "P_SF3"      "WORK_INJ"  
## [66] "HISPANIC"   "RACE"       "LOCATION"   "SURVIVED"

Ideas to plot

Maps

Person Data:

* Age (possible to compare to the age distribution of the state?)
* Compare types of restraints and injury severity
* Drinking vs. age
* Drugs vs. age
* Underage drunks
* Drugs by state
* Breakdowns of cyclist info
* Lag time from crash to death
* How often are they at work
    * Is there something interesting here when combined with time of day?
* LOCATION for where non-motorists were at the time of the crash

## [1] 25